Abstract 

Background: This study describes how the complete mitogenome of a terrestrial snail, Cylindrus obtusus 
(Draparnaud, 1805), was sequenced without PCRs from a collection specimen that had been in 70% ethanol for 8 
years. The mitogenome was obtained with Illumina GAIIx shotgun sequencing. Although the specimen used was 
collected relatively recently and kept in a DNA-friendly preservative (not formalin, as frequently used with old 
museum specimens), we believe that the exclusion of PCRs as facilitated by NGS (Next Generation Sequencing) 
removes a great obstacle in DNA sequencing of collection specimens. A brief comparison is made between our 
Illumina GAIIx approach and a similar study that made use of the Roche 454-FLX platform. 

Results: The mtDNA sequence of C. obtusus is 14,610 bases in length (about 0.5 kb larger than other 
stylommatophoran mitogenomes reported hitherto) and contains the 37 genes (13 protein coding genes, two 
rRNAs and 22 tRNAs) typical for metazoans. Except for a swap between the position of tRNA-Pro and tRNA-Ala, the 
gene arrangement of C. obtusus is identical to that reported for Cepaea nemoralis. The 'aberrant' rearrangement of 
tRNA-Thr and COIII compared to that of other Sigmurethra (and the majority of gastropods) is not unique to C. 
nemoralis (subfamily Helicinae), but is also shown to occur in C. obtusus (subfamily Ariantinae) and might be a 
synapomorphy for the family Helicidae. 

Conclusions: Natural history collections potentially harbor a wealth of information for the field of evolutionary 
genetics, but it can be difficult to amplify DNA from such specimens (due to DNA degradation for instance). 
Because NGS techniques do not rely on primer-directed amplification (PCR) and allow DNA to be fragmented 
(DNA gets sheared during library preparation), NGS could be a valuable tool for retrieving DNA sequence data 
from such specimens. A comparison between Illumina GAIIx and the Roche 454 platform suggests that the former 
might be better suited for de novo sequencing of mitogenomes. 




Genomics 



Background 

Although NGS techniques have advanced rapidly over recent 
years and sequencing of entire mitochondrial genomes 
(mitogenomes) has consequently become more common 
[1-3], knowledge about stylommatophoran ('terrestrial 
snails') mitogenomes seems to have advanced at a slower 
pace. For over two decades, the complete mitogenomes 
of only two stylommatophorans, Cepaea nemoralis [4] 
and Albinaria caerulea [5], had been known. Of a third 
species, Euhadra herklotsi, most of the mitogenome has 
been covered, so comparisons of mitochondrial gene 
arrangements could be made [6-8], but a complete 
sequence for that mitogenome is still missing [6,9,10]. 
Sequencing mitogenomes has been quite cumbersome 
because enrichment of the mitochondrial fraction (e.g. 
physical isolation of mitochondria, cloning of large mito- 
chondrial fragments, long range, simplex or multiplex 
PCR) was quite laborious [1-3,8,11-15]. Moreover, the 
throughput of traditional (Sanger) sequencing is limited 
and sequencing of larger (> 1 kb) fragments is often 
delayed by the development of internal primers ('primer 
walking'). With NGS technologies, primer-directed 
amplification is no longer necessary and genome 
sequencing, of mitogenomes in particular because of their 
small size and high copy number, has become fast and 
easy. 

Nearly all traditional (Sanger) [1-3,8,11-15] and still 
some NGS [16] approaches for sequencing mitogenomes 
rely on long range PCR amplification. Due to the often 
degraded state of the DNA, this by and large excludes the 
use of specimens from natural history collections. An 
alternative is the hybridization capture approach [17], but 
this requires a priori sequence knowledge in order to 
design probes. Although enrichment of the mitochondrial 
fraction by long range PCR will increase the chance of 
obtaining a complete mitogenome (and facilitate the use 
of multiple specimens if sequence tags are exploited; [18]), 
it is not essential to NGS (e.g. [19-21]). In fact, the first 
step in NGS library preparation is fragmentation of the 
DNA. Consequently, DNA sequence data might be 
obtained from specimens of natural history collections 
with NGS, where PCR-based approaches fail. Whether 
NGS will allow, e.g. the recovery of complete mitogen- 
omes from collection specimens, will depend on various 
parameters such as: the extent of DNA degradation, the 
ratio of nuclear to extrachromosomal DNA (which 
depends on the size of the genomes as well as on the type 
of tissue selected) and the number and length of the 
obtained sequences (dependent on the selected NGS platform). 
To the best of our knowledge, mitogenomes from 
NGS studies have thus far been obtained either with the use of a 
long PCR enrichment procedure [16] prior to the NGS 
run, or with traditional Sanger sequencing after the run 
(to close the gaps remaining after assembly of the mito- 
genome, or to get an acceptable coverage) [19,20,22]. 
Since each of these approaches relies on PCR, both can be 
impracticable for (fragmented) DNA retrieved from 
museum specimens. With this study we wanted to test 
whether it would be feasible (using the Illumina GAIIx 
platform) to obtain a complete mitogenome, without PCR, 
from a single museum specimen. 

We selected C. obtusus because it is an interesting 
species from both a morphological and a biogeographic 
point of view. C. obtusus is endemic to the Austrian 
Alps where it can be found in calcareous areas [23] at 
altitudes nearly always above 1,600 m. It has a disjunct 
distribution, probably mirroring the small insular areas 
('nunataks'; [24]) in which it survived the last glacial 
maximum (LGM). Except for some Late Pleistocene spe- 
cimens [25] there is no fossil record for C. obtusus. 
Cylindrus constitutes a monotypic genus. Within the 
helicid subfamily Ariantinae it is aberrant by being the 
only species with a cylindrical shell (Figure 1); other 
members of this speciose subfamily have broadly 
depressed (e.g. Campylaea, Helicigona and Chilostoma) 
or globular (Arianta arbustorum) shells (Figure 2 in 
[26]). 

Figure 1 Map of the mitochondrial genome of Cylindrus 
obtusus (GenBank accession nr. JN107636). Genes on the outer 
circle are transcribed clockwise; genes on the inner circle are 
transcribed counterclockwise. tRNAs are denoted by their one-letter 
abbreviations. Regions I, II and III are regions that could not be 
assigned to any mitochondrial gene; Region III could be the most 
likely location for the mitochondrial control region (see Discussion). 



This study shows that NGS can aid in the retrieval of 
sequence data (here a complete mitogenome) without 
using PCRs. Due to DNA degradation, PCRs are often a 
bottleneck for museum specimens. Based on our results 
for a specimen that has been in 70% ethanol for 8 years, we 
argue that NGS is a promising technique for obtaining 
sequence data from museum specimens. We report the 
third complete mitogenome for a species of terrestrial snail 
ever published and compare our Illumina GAIIx strategy 
for sequencing mitogenomes with a similar study [20] in 
which the 454 platform of Roche was deployed. 

Methods 

Collection and preservation 

Specimens of C. obtusus were collected by J. Gould in 
2001 in Großer Buchstein (3.5 km NW of Gstatterboden), 
Ennstaler Alpen (Austria; 47°37' N 14°36' E) at an 
elevation of 2,200 m. After collection the specimens 
were drowned in water and subsequently placed in 70% 
ethanol. Finally they were stored in the molluscan wet 
collection of NCB Naturalis under collection number 
RMNH.MOL.144846. 

DNA extraction and quality assessment 

DNA was extracted from a single specimen with a 
DNeasy Blood and Tissue kit (Qiagen). Apart from 
using a total of 40 µl of Proteinase K and overnight 
lysis, the manufacturer's instructions were followed. The 
DNA concentration of the extract was measured on a 
Nanodrop 1000 spectrophotometer (Thermo Scientific) 
and checked on an agarose gel. 

Confirmation of NGS output 

Because only a small number of C. obtusus (microsatel- 
lite) sequences were present in GenBank [27], we tried 
to sequence COI and CytB which would allow the iden- 
tification of mitochondrial contig sequences (expected 
GAIIx output). To do so, the following primers were 
selected: L1490 & H2198 for COI [28] and UCYTB151F 
& UCYTB270R for CytB [29]. PCRs were performed in 
25 µl volumes using 1.5 mM MgCl2, 0.2 mM dNTPs, 
0.4 mM of each primer and 0.25 µl (1.25 units) of Taq 
DNA polymerase (Qiagen). A thermoprofile of 3 min. at 
94°C, followed by 40 cycles of 15 sec. at 94°C, 30 sec. at 
50°C and 40 sec. at 72°C, and a final extension of 5 min. 
at 72°C was used for both markers. The PCR products 
were sent to Macrogen Europe (Amsterdam) where they 
were purified with a Montage purification kit (Millipore) 
and sequenced in both directions (using the same pri- 
mers that were used for PCR) on an ABI3730XL. Con- 
tigs of forward and reverse sequences were assembled 
with Sequencher v. 4.10.1. 

Assessing limitations of the DNA extract: long range PCR 

Although the aim of this study is to assess the possibility 
of sequencing a complete mitogenome without PCRs 
from a collection specimen, the underlying assumption 
is that enrichment of the target sequence(s) by primer- 
directed amplification will be difficult or impossible for 
these kinds of object. To test this assumption, we tried 
to enrich the mitochondrial fraction of the obtained 
DNA extract by means of long range PCR. To increase 
the chance of successfully amplifying the complete mito- 
chondrion, two Cylindrus specific primer sets (A and B) 
were designed that each amplified (roughly) half of the 
mitochondrion. These two primer-sets were tested with 
the "Expand Long Template PCR System" of Roche 
(Cat. No. 11 681 834 001), following the manufacturer's 
protocol. Primer-sets A and B were designed with Pri- 
mer3 [30] and face outward of the obtained COI and 
CytB sequences: 

A-Cobt-COI- 5'-TTACAACTATTTTTAATATGCGTTCTCCT-3' & 

A-Cobt-CB- 5'-CGACGAGAAATAAAACATTTAACATAACTA-3' and 

B-Cobt-CB- 5'-TACCTTTTGTGATTAGTGTTTTTGTGTTAT-3' & 

B-Cobt-COI- 5'-TATTATTTATCCGGGGAAACCTTATATC-3' 

We assumed that the orientation of COI and CytB 
would be identical to that of C. nemoralis. 
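
As a quick sanity check on the four long range primers listed above, their lengths and GC fractions can be tabulated. The snippet below is only a minimal sketch: it assumes that the hyphens and spaces inside the printed sequences are line-break artefacts rather than part of the oligonucleotides, and the helper name gc_fraction is ours, not part of Primer3.

```python
# Long range primers as listed in the text; internal line-break hyphens
# and spaces are assumed to be typesetting artefacts and were removed.
PRIMERS = {
    "A-Cobt-COI": "TTACAACTATTTTTAATATGCGTTCTCCT",
    "A-Cobt-CB": "CGACGAGAAATAAAACATTTAACATAACTA",
    "B-Cobt-CB": "TACCTTTTGTGATTAGTGTTTTTGTGTTAT",
    "B-Cobt-COI": "TATTATTTATCCGGGGAAACCTTATATC",
}


def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}")
```

The low GC fractions that this prints are in keeping with the AT-rich nature of gastropod mitochondrial DNA.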



GAIIx library preparation 

For DNA extracts from fresh tissues, the first step in 
library preparation is fragmentation of (genomic or PCR 
amplified) DNA. For extracts from museum specimens, 
DNA can already be expected to be fragmented, which 
would make this step unnecessary or even detrimental 
([17] and references therein). The extent of DNA degra- 
dation will depend heavily on the preservation history. 
Based on the quality assessment of our DNA extract 
and the adverse effect that improperly fragmented DNA 
has on GAIIx runs, we decided to follow a general 
library preparation procedure, for which we used the 
NEBNext™ DNA Sample Prep Kit (E6000-L, New Eng- 
land BioLabs). The DNA extract was randomly sheared 
with a nebulizer (K7025-05, Invitrogen) for 6 min. at 2.4 
bar (35 psi) to obtain fragments in the range of 200-600 
nucleotides. Fragments with a length of ca. 300 bp 
(insert-length without adaptors approx. 200 nucleotides) 
were extracted from an MS8-agarose gel (Talron Bio- 
tech. L.T.D.) with a Zymoclean Gel DNA Recovery Kit 
(Zymo Research, Orange, CA, USA). For all subsequent 
steps the NEBNext™ DNA Sample Prep Kit protocol 
was followed. 

GAIIx run and data analysis 

The prepared library was run at BaseClear B.V. (The 
Netherlands) on a single lane of a GAIIx flow cell. CLC 
Genomics Workbench version 4.0 (CLC Bio, Cambridge, 
MA, USA) was used to filter the output and for de novo 
assembly. 

Annotation of the mitogenome 

The contig sequence was annotated, based on similarity 
with available sequences in GenBank (Blast searches; 
http://www.ncbi.nlm.nih.gov:80/BLAST/), using pairwise 
alignments with, in particular, C. nemoralis and with the 
organellar genome annotation program DOGMA [31]. 
In order for DOGMA to detect all potential tRNAs (22 
expected) we ultimately set the COVE-score to zero, but 
even then some were still missing. Therefore, the programs 
tRNAscan [32] and ARWEN [33] were used as 
well. 

Results 

DNA extraction 

The DNA extract for the selected C. obtusus specimen 
had a yield of 28 µg (200 µl buffer AE; concentration 139.8 
ng/µl) and degradation as judged on the agarose gel was 
limited, considering the age of the specimen. 
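
The reported yield follows directly from the elution volume and the measured concentration. A minimal sketch of that arithmetic (the function name is ours):

```python
def total_yield_ug(volume_ul, conc_ng_per_ul):
    """Total DNA yield in micrograms from volume (µl) and concentration (ng/µl)."""
    return volume_ul * conc_ng_per_ul / 1000.0


# 200 µl of buffer AE at 139.8 ng/µl, as reported above
print(f"{total_yield_ug(200, 139.8):.2f} µg")
```

This gives 27.96 µg, which rounds to the 28 µg stated.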

PCR assessment 

The PCRs for the 396 bp and 705 bp (including primer 
sequences) fragments of CytB and COI, respectively, 
worked on the first attempt and were directly sequenced. 
We failed to make the long PCRs work, either with our 
C. obtusus specific primers (see Methods: Assessing limitations 
of the DNA extract) or with any combination of the 
"universal" primers that successfully amplified the shorter 
COI and CytB sequences (L1490 & UCYTB270R; 
UCYTB151F & H2198). 

Illumina GAIIx run 

The GAIIx run resulted in 34,174,164 reads with an 
average read length of 50 nucleotides. Of these, 685,537 
reads were overlapping and used for a de novo assembly 
(CLC Bio version 4.0). This resulted in 740 contigs with 
a total length of 478,878 bp. The largest contig had a 
length of 14,610 nucleotides and an average coverage of 
26.65x. This contig was identified as the mitogenome 
of C. obtusus based on the expected length 
(roughly 14 kb), the presence of the COI and CytB 
sequences (see Methods: Confirmation of NGS output) 
and on similarity with mitochondrial sequences from 
other stylommatophorans as resulting from Blast 
searches. 
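
The first two identification criteria (expected length and presence of the independently sequenced COI and CytB fragments) are easy to express as a filter over the assembled contigs. The sketch below is illustrative only: the contig and marker sequences are toy stand-ins, the length window is scaled down to match them, and the function names are hypothetical; it does not reproduce the CLC Genomics Workbench or Blast steps.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_contigs(contigs, markers, min_len, max_len):
    """Names of contigs within the length window that contain every
    marker sequence on either strand."""
    hits = []
    for name, seq in contigs.items():
        if not (min_len <= len(seq) <= max_len):
            continue
        if all(m in seq or reverse_complement(m) in seq for m in markers):
            hits.append(name)
    return hits


# Toy data (NOT real sequences): two short markers stand in for COI and CytB.
toy_contigs = {
    "contig_1": "ACGTACGTAC",
    "contig_2": "TTTGATTACATTTCCGGTTT",
    "contig_3": "TTTGATTACATTTAAAAAAA",
}
toy_markers = ["GATTACA", "ACCGG"]  # the second is found on the reverse strand
print(candidate_contigs(toy_contigs, toy_markers, min_len=15, max_len=25))

# Mean contig length for the real assembly: 478,878 bp over 740 contigs
MEAN_CONTIG_LEN = 478_878 / 740
print(f"mean contig length: {MEAN_CONTIG_LEN:.0f} bp")
```

With a mean contig length of roughly 650 bp, a single 14,610 nt contig stands out clearly, which is why length alone is already a strong first filter.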

Initial assignment of PCGs and rRNAs 

Twelve of the expected 13 protein-coding genes (PCGs), 
as well as both of the ribosomal RNAs (rRNAs) were 
recognized by DOGMA [31]; ATP8 had to be located 
based on a pairwise alignment with C. nemoralis and A. 
caerulea. Although the gene arrangement (of the PCGs 
and rRNAs) as assigned by DOGMA seemed correct 
(compared to the gene arrangement for C. nemoralis), 
the program had difficulties determining the gene bound- 
aries (most likely due to the absence of similar sequences 
on GenBank). Since we lack data from peptide sequen- 
cing for any of the PCGs, the putative gene boundaries 
(Table 1) were determined based on pairwise alignments 
with the amino acid sequences of C. nemoralis, A. caerulea 
and E. herklotsi. Nine of the PCGs start with a common 
initiation codon (ATA 5x; ATG 4x); the other four 
start with less common (but not unique for invertebrates) 
initiation codons (TTG 2x; ATC 1x and GTG 1x). For 
four PCGs (unrelated to the four just mentioned) we had 
to infer that they ended with a truncated termination 
codon (that is, the stop codon is most likely generated by 
posttranscriptional polyadenylation; [5,34]). 
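
The codon tallies above, and the way a truncated stop codon is completed, can be written out as a short sketch. The assumption here (not stated in the text) is that a truncated termination codon consists of T or TA, which polyadenylation of the transcript extends to TAA; the names used are ours.

```python
# Initiation codon counts as reported for the 13 PCGs
START_CODONS = {"ATA": 5, "ATG": 4, "TTG": 2, "ATC": 1, "GTG": 1}
COMMON = {"ATA", "ATG"}

n_common = sum(n for codon, n in START_CODONS.items() if codon in COMMON)
n_total = sum(START_CODONS.values())
print(f"{n_common} common + {n_total - n_common} less common = {n_total} PCGs")


def complete_stop_codon(truncated):
    """Extend a truncated stop codon (assumed to be T or TA) to TAA,
    mimicking the A residues added by polyadenylation."""
    truncated = truncated.upper()
    if truncated not in ("T", "TA"):
        raise ValueError("expected a truncated stop codon 'T' or 'TA'")
    return truncated + "A" * (3 - len(truncated))


print(complete_stop_codon("T"), complete_stop_codon("TA"))
```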

Annotation of rRNAs 

The conserved regions at the beginning and end of both 
rRNAs (described in Figure 4 in [5]) were found in the 
contig sequence of C. obtusus as well. However, for both 
A. caerulea and C. nemoralis, the annotation of rrnS 
(12S) and rrnL (16S) extends beyond these conserved 
regions. Even though the exact gene boundaries of these 
ribosomal genes need to be confirmed by transcript 
mapping, the sequence data for A. caerulea and C. 
nemoralis show little to no space between the rRNAs 
and the surrounding tRNAs. Consequently, we based 
the putative boundaries of 12S and 16S (Table 1) on 
alignments with sequences of the just mentioned species 
for those genes and the position of the surrounding 
tRNAs (described in the next paragraph). 

Annotation of tRNAs 

Because of the low COVE-score, half of the tRNAs 
assigned by DOGMA (17 out of 34) were false positives, 
but even with these relaxed settings tRNA-P(Pro), -G 
(Gly), -S2(Ser), -I(Ile) and -K(Lys) were missed. None 
of the missing tRNAs could be detected with tRNAscan. 
Using the least restrictive parameters, tRNAscan yielded 
no false positives, but merely detected four tRNAs (-L1 
(Leu), -N(Asn), -M(Met) and -T(Thr)) that were already 
assigned by DOGMA. Of the three programs tested, 
only ARWEN found 20 of the 22 tRNAs at the cost of 
just one false positive. When the output of ARWEN and 
DOGMA was combined, all tRNAs except tRNA-G(Gly) 
were assigned. Apart from a swap between tRNA-P and 
tRNA-A(Ala), the gene order for C. obtusus is identical 
to that of C. nemoralis. Alignments that included 
sequences of both A. caerulea and C. nemoralis [6,7] 
showed that tRNA-G is located between tRNA-W(Trp) 
and tRNA-H(His). In the annotated mitochondrion of 
C. obtusus (Figure 1), there is indeed an unassigned 
region between tRNA-W and tRNA-H. An alignment 
with sequences of C. obtusus, A. caerulea and C. nemor- 
alis showed a conserved sequence, TACCTTCCAAG 
(8797-8809) within this non-annotated region of C. 
obtusus, which represents the anticodon loop and part 
of the anticodon stem of tRNA-G. Consequently, this 
alignment was used to infer the approximate location of 
tRNA-G, even though the secondary structure for this 
tRNA could not be predicted. Additionally, the location 
of each tRNA was confirmed by the presence of the 
anticodon. A map of the mitogenome of C. obtusus is 
depicted in Figure 1. A summary of the mitochondrial 
genome content is given in Table 1. The corresponding 
annotated sequence was deposited in GenBank under 
accession number JN107636. 
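
Locating a conserved motif such as the tRNA-G fragment TACCTTCCAAG in an unassigned region amounts to an exact search on both strands. The following is a minimal sketch under that assumption; the test sequence is a toy construct, not the C. obtusus contig, and the function names are ours. In practice the motif was found through alignment, which also tolerates mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return (start, end, strand) for every exact occurrence of motif,
    using 1-based inclusive coordinates on the forward strand."""
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, start + len(pattern), strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


MOTIF = "TACCTTCCAAG"  # conserved tRNA-G fragment quoted in the text
# Toy sequence: one forward copy and one reverse-complement copy
toy = "GGG" + MOTIF + "AAA" + reverse_complement(MOTIF) + "CC"
print(find_motif(toy, MOTIF))
```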

Discussion 

At present, NGS techniques are being used increasingly 
in mitogenomic studies. One of the commonly used 
platforms is Roche 454 [16,19,20,22]. The ability to generate 
longer reads (which facilitates de novo assembly) 
and the reduction in sequence costs as compared to 
Illumina GAIIx have undoubtedly added to the popularity 
of this platform. An overview of the costs and read 
lengths of these (and other) NGS platforms is given in 
Glen et al. [35]. When sequencing mitogenomes from 
museum specimens [21,36] the ability to sequence 

Table 1 Summary of the mitochondrial genome content of C. obtusus 

longer DNA fragments likely is of no advantage and 
such studies generally rely on platforms that are opti- 
mized for short fragments, such as Illumina GAIIx. 
Despite the momentum that NGS has provided for 
ancient DNA research [17], the number of mitogenomic 
studies that actually use these new techniques to exploit 
natural history collections is still rather limited. This 
paper provides an example of how NGS technology can 
be used to retrieve genetic information from a museum 
specimen. We show that it is feasible to sequence a 
complete mitogenome, without PCR, from a snail that 
has been in 70% ethanol for eight years. 

Thus far, to the best of our knowledge, only Feldmeyer et al. [20] have 
used an NGS platform to sequence the complete 



Groenenberg et al. BMC Genomics 2012, 13:114 
http://www.biomedcentral.com/1471-2164/13/114 



mitogenome of another gastropod. In that study, a simi- 
lar approach (no prior enrichment of the mitochondrial 
fraction) was taken, albeit with freshly collected speci- 
mens and a different NGS platform (Roche 454-FLX). 
Despite the commonalities, there were some noteworthy 
differences between the two studies as well. Firstly, Feldmeyer 
et al. [20] used 13 specimens to obtain a complete mitogenome, 
whereas our results were obtained from a single specimen. 
The rationale behind using this many specimens is 
not given by those authors; perhaps they wished to 
account for intraspecific heterogeneity between the 
selected populations. It had nothing to do with the 
'sensitivity' of the different platforms; the amount of DNA 
that Feldmeyer et al. [20] used for their sample preparation 
(6 µg) was roughly similar to what was used in this 
study (5 µg). The ability to obtain a complete mitogenomic 
sequence from a single specimen obviously is an 
advantage. Secondly, in contrast to our GAIIx run, the 
454 run did not cover the complete mitogenome, 
requiring the design of additional primers and Sanger 
sequencing to close the three gaps that were left over 
after assembly. This is also observed in other mitoge- 
nomic studies in which the 454 platform was used with- 
out prior enrichment of the mitochondrial fraction 
[19,22]. After filtering, the 454 run resulted in 114 reads 
with an average length of 318 nt [20] that could be 
assigned to the mitochondrial genome, compared to 
7,808 reads (out of 34,174,164) with an average length 
of 50 nt obtained with GAIIx. Thus the 454 run resulted 
in 36,252 mitogenomic nucleotides, whereas our GAIIx 
run yielded 390,400. The maximum coverage obtained 
with 454 and GAIIx is 2.6x and 26.7x for R. balthica 
and C. obtusus for mitogenome sizes of 13,993 bp and 
14,610 bp, respectively. Generally longer reads as 
obtained with the 454 platform facilitate de novo assem- 
bly and might be preferred when little a priori sequence 
information is available ([37] and references therein). 
When reconstructing the just mentioned mitogenomes 
the sheer number of short reads generated by the GAIIx 
platform outcompeted the smaller number of longer 
reads as obtained with 454 sequencing. 
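
The figures in this comparison can be reproduced from the read counts, mean read lengths and mitogenome sizes given above. The sketch below assumes that coverage is simply the number of mitogenomic nucleotides divided by the mitogenome length; the function names are ours.

```python
def mito_bases(n_reads, mean_read_len):
    """Total mitogenomic nucleotides from read count and mean read length."""
    return n_reads * mean_read_len


def mean_coverage(n_reads, mean_read_len, genome_len):
    """Mitogenomic nucleotides divided by mitogenome length."""
    return mito_bases(n_reads, mean_read_len) / genome_len


runs = {
    "454 (R. balthica)": dict(n_reads=114, mean_read_len=318, genome_len=13_993),
    "GAIIx (C. obtusus)": dict(n_reads=7_808, mean_read_len=50, genome_len=14_610),
}
for label, run in runs.items():
    bases = mito_bases(run["n_reads"], run["mean_read_len"])
    print(f"{label}: {bases:,} nt, {mean_coverage(**run):.1f}x")

# Share of all GAIIx reads that map to the mitogenome
mito_fraction = 7_808 / 34_174_164
print(f"mitochondrial reads: {mito_fraction:.4%} of the run")
```

Computed this way, the ratios agree with the 36,252 and 390,400 nucleotides and the 2.6x and 26.7x coverage quoted above, and show that only about 0.02% of the GAIIx reads were mitochondrial.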

Identification of tRNAs in nematodes [38] and stylom- 
matophorans [4-6] can be difficult because the standard 
cloverleaf secondary structure may not be present (T or D 
arms can be lacking; see tRNA-H, -S1 and -S2 in Figure 2) 
and pulmonate tRNAs can undergo post-transcriptional 
processing [7,39]. In a number of instances different 
tRNAs were predicted in the same approximate nucleotide 
region, depending on the algorithm used. The hypothetical 
tRNAs differed a few nucleotides in length, or were shifted 
a few bases, or both, causing a slight shift in the anti- 
codon region. Despite the fact that only one of the tRNAs 
will be real, both algorithms correctly point to (roughly) 
the same nucleotide region for the placement of a tRNA. 



In other instances, the different algorithms predicted 
tRNAs on exactly the same position but on opposite 
strands, causing the tRNAs to be the reverse-complement 
of each other [40]. Examples of the latter within C. obtusus 
are predictions by DOGMA of tRNA-W and tRNA-C at 
positions 3632-3694 and 3232-3293, for which ARWEN 
predicted tRNA-P and tRNA-A, respectively. For both 
tRNAs, only those predicted by ARWEN were in agree- 
ment with existing annotations for related species in 
GenBank. 
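
Why exactly these pairs (W versus P, C versus A) get confused can be illustrated with the anticodons. The sketch below assumes the usual anticodons TGG for tRNA-P and TGC for tRNA-A (written as DNA, 5' to 3'); these are not reported in the text. Reading the same three positions on the opposite strand reverse-complements the anticodon, which then pairs with a codon for a different amino acid.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the four codons needed here; their assignments are the same in the
# standard and the invertebrate mitochondrial genetic code.
CODON_TO_AA = {"CCA": "Pro", "TGG": "Trp", "GCA": "Ala", "TGC": "Cys"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amino_acid_for_anticodon(anticodon):
    """Amino acid of the codon that pairs with this anticodon."""
    return CODON_TO_AA[reverse_complement(anticodon)]


def opposite_strand_call(anticodon):
    """Amino acid inferred when the anticodon is read on the other strand."""
    return amino_acid_for_anticodon(reverse_complement(anticodon))


for anticodon in ("TGG", "TGC"):  # assumed anticodons of tRNA-P and tRNA-A
    print(anticodon, amino_acid_for_anticodon(anticodon),
          "-> opposite strand:", opposite_strand_call(anticodon))
```

Under these assumptions a tRNA-P read on the wrong strand is called tRNA-W, and a tRNA-A is called tRNA-C, matching the conflicting predictions described above.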

Although gene rearrangements are common in the 
mitogenomes of molluscs [7,8] and gastropods in particu- 
lar [2,3,41], little is known about the mitochondrial gene 
organisation of terrestrial snails. The mt gene arrange- 
ments depicted by Yamazaki et al. [6] and Boore et al. [8] 
show less similarity between C. nemoralis and E. herklotsi 
(both species belonging to the Helicoidea), than between 
each of them and A. caerulea (a species belonging to the 
Clausilioidea). Based on a three-taxon statement Yama- 
zaki et al. [6] concluded that the rearrangement of the 
tRNAs between COII and ATP8 represented a derived 
state in E. herklotsi. Similarly, they concluded that the 
positions of tRNA-P and the gene-region tRNA-T/COIII 
represented a derived state in C. nemoralis. By comparing 
these gene rearrangements with the gene order observed 
in C. obtusus, we can get some insight into the evolution of 
the mitochondrial gene order of the Helicoidea. Figure 3 
gives an overview of the gene organisation of the four sty- 
lommatophoran mitogenomes currently known. Starting 
at COI the first observed rearrangement is the relocation 
of tRNA-P within C. nemoralis. Although the location of 
tRNA-P in C. obtusus is the same as that found within 
the other stylommatophorans, the gene region itself 
seems to be prone to rearrangements within 
the Helicidae. In C. obtusus we see that, instead of tRNA-P, 
tRNA-A has been relocated between ND6 and ND5. 
Thus within the Stylommatophora, the relocation of a 
tRNA from the A, P-region to between ND6 and ND5 
seems to be a synapomorphy for the Helicidae. The sec- 
ond gene rearrangement (Figure 3) is found in E. herk- 
lotsi within a series of tRNAs located between COII and 
ATP8. Despite the fact that these tRNAs (Y, W, G, H, Q, 
L2) have rearranged frequently during gastropod evolution 
[2,3], the gene order seems rather conserved within 
the Eupulmonata. The five eupulmonates currently listed 
in the genome database of NCBI and C. obtusus all show 
the order Y, W, G, H, Q, L2. Therefore, we endorse the 
conclusion of Yamazaki et al. [6] that the arrangement of 
these tRNAs as observed in E. herklotsi can be considered 
a derived state (and possibly represents an apomorphy 
for the Bradybaenidae). Presence of the ancestral tRNA 
arrangement in both C. obtusus and C. nemoralis suggests 
that this gene order has not changed in the Helicidae. 
The last gene rearrangement is the relocation of the 

Figure 2 Potential secondary structures of 21 inferred tRNAs of Cylindrus obtusus mtDNA. Except for tRNA-Y, which was predicted by DOGMA, 
all tRNAs were predicted by ARWEN. The secondary structure for tRNA-G is missing, because it could not be predicted by any of the programs tested. 



gene-region tRNA-T/COIII in C. nemoralis from between 
ND4 and ND2 to between ND3 and ND4. Exactly the 
same rearrangement is also observed in C. obtusus, likely 
indicating an apomorphy for the Helicidae. 
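
The relocation of tRNA-T/COIII can be made explicit by comparing the genes that flank this block in two gene orders. The orders below are deliberately simplified toy lists: they keep only the protein-coding landmarks named in the text, omit all other tRNAs, ignore strand, and treat the circular genome as linear. The function name is ours.

```python
def flanking_genes(order, block):
    """Return the genes directly left and right of a contiguous block."""
    n = len(block)
    for i in range(len(order) - n + 1):
        if order[i:i + n] == block:
            left = order[i - 1] if i > 0 else None
            right = order[i + n] if i + n < len(order) else None
            return left, right
    raise ValueError("block not found as a contiguous unit")


# Simplified (toy) gene orders based on the description in the text
OTHER_STYLOMMATOPHORANS = ["ND3", "ND4", "T", "COIII", "ND2"]
HELICIDAE = ["ND3", "T", "COIII", "ND4", "ND2"]

BLOCK = ["T", "COIII"]
print("other stylommatophorans:", flanking_genes(OTHER_STYLOMMATOPHORANS, BLOCK))
print("Helicidae:", flanking_genes(HELICIDAE, BLOCK))
```

In this simplified view the block moves from between ND4 and ND2 to between ND3 and ND4, as described for C. nemoralis and C. obtusus.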



Most metazoan mitogenomes possess a single major 
non-coding region presumed to contain the signals for 
replication and transcription. This region is usually 
referred to as the control region [42]. In some groups of 

Figure 3 Comparison of the gene order of the four known stylommatophoran mitogenomes. For each mitogenome genes above the 
horizontal line are transcribed from left to right; genes below the horizontal line are transcribed from right to left. tRNAs are denoted by their 
one-letter abbreviations. The diagonal lines indicate mitochondrial gene rearrangements between the specified taxa. In the gene-map of C. 
obtusus, I-III indicate three unassigned regions (see Figure 1 and Discussion). Gene sizes are not drawn to scale. 



invertebrates, such as gastropods [2,5] and spiders [43] 
the mitogenomes can be very compact, hardly leaving 
any non-coding regions of significant length. Although 
the mitogenome of C. obtusus (14,610 bp) is still compact 
compared to that of other Metazoa (approx. 15-24 
kb; [44]), it is about half a kb larger than the mitogenomes 
of C. nemoralis (14,100 bp) and A. caerulea 
(14,130 bp). The schematic overview of this mitogenome 
(Figure 1) shows three unassigned regions (indicated as 
I, II and III) with a length of 394, 181 and 189 nt, 
respectively. When the mitochondrial gene order of C. 
obtusus is compared with that of other stylommatophorans 
currently known (Figure 3), it becomes clear that 
regions I and III coincide with the transposition of 
tRNA-A and the region tRNA-T/COIII. As for region II, 
no gene rearrangement was observed in C. obtusus and 
neither C. nemoralis nor A. caerulea has any unassigned 
sequence between tRNA-W and tRNA-G. But 
then E. herklotsi does show a gene rearrangement in 
that region (Figure 3). In stylommatophorans (or all gastropods 
for that matter) the location of the mitochondrial 
control region is still a subject of debate. Given the 
absence of region II in other stylommatophorans, we 
believe that of the three non-coding regions, region II is 
the least likely location for the control region. As for 
region I, Grande et al. [9] suggested that the region 
between ND6 and ND5 might contain recognition signals 
for transcription in the nudibranch Roboastra europaea. 
Except for Onchidella celtica, which, like C. 
obtusus, has its longest non-coding sequence here [2], 
most of the heterobranch gastropods sequenced thus far 
show very little unassigned sequence between ND6 and 
ND5. Therefore we assume that region I is not the most 
likely location for the control region either. Within the 
heterobranch gastropods the region between COIII and 
tRNA-I is most often cited as the potential location for 
the control region [2,3,9,45]. Because of the shown 
transposition of tRNA-T/COIII within C. nemoralis and 
C. obtusus (Figure 3), this region was transposed as well. 
Thus for those species (and likely all Helicidae), the 
potential control region (the region adjacent to COIII) is 
not located between COIII and tRNA-I, but between 
COIII and tRNA-S1. For A. caerulea and Pupa strigosa, 
the presence of 10 nt (or 20 nt if the 5 nt overlap with 
tRNA-I is included) and 25 nt palindromes, respectively 
[5,45], has been suggested to function as a bidirectional 
promoter for this putative control region. Within 
C. obtusus, C. nemoralis and E. herklotsi, the palindromes 
found in this region were never larger than six 
nucleotides. We were unable to confidently align the 
putative control region for the four mentioned stylommatophorans. 
The length of this region was longer in the 
Helicidae (C. obtusus = 189 nt; C. nemoralis = 158 nt) 
than in the other two stylommatophorans (A. caerulea = 
42 nt; E. herklotsi = 43 nt), which is most certainly related 
to the transposition of tRNA-T/COIII. Besides the apparent 
consistency of the presence of a non-coding region 
adjacent to COIII, there still is little to go on for recogni- 
tion of the control region in heterobranch gastropods. 
Given the limited number of available stylommatophoran 
(and even eupulmonate) mitogenomes (GenBank), we 
consider more extensive genomic comparisons (such as 
[46]) to be premature, based on these data. 
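
Screening a putative control region for palindromes can be sketched as a search for the longest stretch that equals its own reverse complement. This interpretation of "palindrome" is an assumption on our part, as are the function name and the toy input (which contains the six-nucleotide palindrome GAATTC); the sketch does not reproduce the exact procedure used in the studies cited.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def longest_rc_palindrome(seq):
    """Return (start, palindrome) for the longest reverse-complement
    palindrome in seq, with a 1-based start; (0, '') if there is none.
    Such palindromes always have even length, so the search expands
    outwards from every boundary between two adjacent bases."""
    seq = seq.upper()
    best = (0, "")
    for centre in range(1, len(seq)):
        left, right = centre - 1, centre
        while (left >= 0 and right < len(seq)
               and COMPLEMENT.get(seq[left]) == seq[right]):
            left -= 1
            right += 1
        candidate = seq[left + 1:right]
        if len(candidate) > len(best[1]):
            best = (left + 2, candidate)
    return best


print(longest_rc_palindrome("AAGAATTCAA"))  # toy sequence, not real data
```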

Although this study illustrates the potential of NGS to 
obtain genetic information from museum specimens, 
there are some caveats that need to be addressed. As for 
the results of the long range PCR, we did not have a 
recently collected specimen of C. obtusus at our disposal. 
Therefore, we have not shown that the failure of the long 
range PCR was caused by the fact that we used an 'aged' 
collection specimen. Primer mismatch is an unlikely alternative 
explanation, however, because the long range PCRs based on 
the universal primers, which were shown to work with C. obtusus 
for the short fragments, yielded no products either. The fact that an NGS approach 
worked well in this study, also does not imply that NGS 
approaches will always be fruitful when applied to collec- 
tion specimens. A vast number of parameters such as the 
effect of fixatives, time (the "age" of specimens) and pre- 
servation history were not assessed in this study. Also 
annotation of a mitogenome largely based on similarity 
with available sequences in GenBank (as opposed to tran- 
script mapping or peptide sequencing) is hazardous. Clo- 
sely related species might not be present and existing 
annotations are not guaranteed to be flawless [47,48]. 
Another hypothetical problem is that due to the relatively 
short length of the obtained sequences (50 nucleotides 
on average) repeats within the mitogenome could be 
missed. The mitogenomes of gastropods, however, are 
very compact, none of the stylommatophorans sequenced 
thus far show such repeats and the length of the complete 
mt sequence is similar to that of other stylommatophorans. 
We therefore assume that the mitogenome of 
C. obtusus presented here is complete. Based on the 
sheer number of sequences generated with GAIIx and 
454, we are convinced that without PCRs it will be more 
difficult to obtain a complete mitogenome with the latter 
platform (despite the longer reads). Even though our 
GAIIx approach is reasonably similar to the 454 
approach described by Feldmeyer et al. [20], the compar- 
ison will not be conclusive as long as the total genome 
sizes of C. obtusus and R. balthica are unknown. The 
average genome size (http://www.genomesize.com) of 
available stylommatophorans (2.86 c or 2.79 Gb) compared 
to basommatophorans (1.34 c or 1.31 Gb) nevertheless 
suggests that it will be more difficult to obtain the 
mitogenome of C. obtusus than that of R. balthica. It is 
likely that the assembly of mitogenomes will benefit from 
the advances in NGS technologies (e.g. the Illumina HiSeq 
platform), as well as from the promising arrival of third 
generation sequencers (e.g. the PacBio RS platform). 
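
The two genome size figures quoted for each group are related by a fixed conversion. The sketch below assumes that the first value of each pair is a C-value in picograms and uses the widely cited factor of 1 pg = 0.978 x 10^9 bp; both the reading of the unit and the function name are our assumptions.

```python
BP_PER_PG = 0.978e9  # base pairs per picogram of DNA


def pg_to_gb(mass_pg):
    """Convert a C-value in picograms to gigabases."""
    return mass_pg * BP_PER_PG / 1e9


for group, c_value in (("stylommatophorans", 2.86), ("basommatophorans", 1.34)):
    print(f"{group}: {c_value} pg = {pg_to_gb(c_value):.2f} Gb")
```

The conversion gives about 2.80 Gb and 1.31 Gb, close to the values quoted above, so the average stylommatophoran nuclear genome is roughly twice the size of the basommatophoran one.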

Conclusions 

In line with previous studies [21,36], this study shows 
that NGS can aid in the retrieval of mitogenomes from 
museum specimens. Although sequencing of mitogen- 
omes by means of NGS without an enrichment proce- 
dure is very inefficient (only 0.02% of the reads from our 
GAIIx run were used for assembly of the mitogenome), it 
eliminates the use of PCRs which is often a bottleneck 
for degraded DNA samples. Without prior enrichment of 
the mitochondrial fraction, the GAIIx platform (Illumina) 
might be better suited for de novo sequencing of 
mitogenomes than the 454 platform (Roche). Besides 
being much faster than conventional sequencing (which 
generally results in 2x coverage), sequencing of mitogenomes 
by means of NGS also yields higher confidence estimates 
(on average 26x coverage, in this study). 
Except for a swap between tRNA-P and tRNA-A, the 
mitochondrial gene arrangement of C. obtusus is identical 
to that of C. nemoralis. Within the Helicidae the 
region tRNA-L1, P, A might be a hot spot for transposition 
of genes (in particular to the region between ND6 
and ND5). The location of tRNA-T/COIII between ND3 
and ND4 (instead of between ND4 and ND2) might be 
an apomorphy for the family Helicidae. We hope that the 
results of this study will aid future studies on stylommatophoran 
evolution and the phylogeny of the subfamily 
Ariantinae in particular. 